Exploring Feature-Level Duplications on Imbalanced Data Using Stochastic Diffusion Search

Authors

  • Haya Abdullah Alhakbani
  • Mohammad Majid al-Rifaie
Abstract

Swarm intelligence mimics the behaviour of social insects such as bees, wasps and ants to offer a powerful problem-solving metaheuristic, whose strength lies in the network of interactions among the agents of a multi-agent system and between those agents and their environment. One of the algorithms inspired by swarm intelligence is stochastic diffusion search (SDS), which uses some of the processes and techniques found in swarms to solve search and optimisation problems. In this paper, a hybrid approach is proposed to deal with real-world imbalanced data. The proposed model involves oversampling the minority class, undersampling the majority class, and optimising the parameters of the classifier, a Support Vector Machine (SVM). It uses the Synthetic Minority Over-sampling Technique (SMOTE) to perform the oversampling and the agents of SDS to perform an ‘informed’ undersampling of the majority classes. The use of this swarm intelligence technique for the undersampling task is investigated, and its impact on the classification results is demonstrated. In addition to comparing agent-led undersampling with random undersampling, the results are contrasted with those of other well-known techniques on nine real-world datasets. Further experiments are designed to explore the behaviour of the SDS agents during the undersampling process.
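The abstract does not spell out how the SDS agents choose which majority-class samples to discard, so the following is only a minimal sketch under stated assumptions, not the authors' method. It uses the generic SDS loop (a test phase followed by a diffusion phase with passive recruitment) and assumes, hypothetically, that a sample is redundant when one randomly chosen feature matches the same feature of another randomly chosen sample. The function name `sds_undersample` and all of its parameters are illustrative. The SMOTE oversampling and SVM parameter tuning steps are left out.

```python
import random
from collections import Counter


def sds_undersample(majority, n_remove, n_agents=50, n_iters=100, tol=1e-9, seed=0):
    """Drop the n_remove majority rows that SDS agents flag as most redundant.

    Assumption (not taken from the paper): a row counts as redundant when a
    randomly chosen feature matches that feature in another random row.
    """
    rng = random.Random(seed)
    n_rows, n_features = len(majority), len(majority[0])
    # Each agent's hypothesis is the index of one majority-class row.
    hypotheses = [rng.randrange(n_rows) for _ in range(n_agents)]
    votes = Counter()

    for _ in range(n_iters):
        # Test phase: a cheap, partial evaluation of every hypothesis.
        active = []
        for h in hypotheses:
            other = rng.randrange(n_rows)
            f = rng.randrange(n_features)
            is_active = other != h and abs(majority[h][f] - majority[other][f]) <= tol
            active.append(is_active)
            if is_active:
                votes[h] += 1  # accumulate evidence of redundancy

        # Diffusion phase (passive recruitment): an inactive agent polls a
        # random peer and copies its hypothesis if that peer is active;
        # otherwise it picks a fresh random hypothesis.
        updated = list(hypotheses)
        for a in range(n_agents):
            if not active[a]:
                peer = rng.randrange(n_agents)
                updated[a] = hypotheses[peer] if active[peer] else rng.randrange(n_rows)
        hypotheses = updated

    # Remove the rows with the most accumulated votes.
    ranked = sorted(range(n_rows), key=lambda i: votes[i], reverse=True)
    removed = set(ranked[:n_remove])
    return [row for i, row in enumerate(majority) if i not in removed]


# Toy data: three identical rows mixed with three rows that share no values.
rows = [[1, 10], [5, 50], [2, 20], [5, 50], [3, 30], [5, 50]]
kept = sds_undersample(rows, n_remove=2)
```

In this toy run only the repeated rows can ever make an agent active, so they are the candidates for removal. A random undersampler, the baseline mentioned in the abstract, would instead drop rows without regard to redundancy.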


Similar resources

Ensemble-Based Wrapper Methods for Feature Selection and Class Imbalance Learning

The wrapper feature selection approach is useful in identifying informative feature subsets from high-dimensional datasets. Typically, an inductive algorithm “wrapped” in a search algorithm is used to evaluate the merit of the selected features. However, significant bias may be introduced when dealing with a highly imbalanced dataset. That is, the selected features may favour one class while bein...


Stochastic Diffusion Search Review

Stochastic Diffusion Search (SDS), first introduced in 1989, belongs to the extended family of Swarm Intelligence algorithms. In contrast to many nature-inspired algorithms, SDS has a strong mathematical framework describing its behaviour and convergence. In addition to concisely exploring the algorithm in the context of natural swarm intelligence systems, this paper reviews the developments of t...


Feature Selection for Small Sample Sets with High Dimensional Data Using Heuristic Hybrid Approach

Feature selection can be decisive when analyzing high-dimensional data, especially with a small number of samples. Feature extraction methods do not perform well under these conditions. With small sample sets and high-dimensional data, exploring a large search space and learning from insufficient samples become extremely hard. As a result, neural networks and clustering a...


A Statistical Study of two Diffusion Processes on Torus and Their Applications

Diffusion processes such as Brownian motions and Ornstein-Uhlenbeck processes are classes of stochastic processes that have been investigated by researchers in various disciplines, including the biological sciences. It is usually assumed that the outcomes of these processes lie in Euclidean spaces. However, some data in physical, chemical and biological phenomena indicate that they cann...


Attention through Self-Synchronisation in the Spiking Neuron Stochastic Diffusion Network

The paper discusses ensemble behaviour in the Spiking Neuron Stochastic Diffusion Network, SNSDN, a novel network exploring biologically plausible information processing based on higher order temporal coding. SNSDN was proposed as an alternative solution to the binding problem [1]. SNSDN operation resembles Stochastic Diffusion Search, SDS, a nondeterministic search algorithm able to rapidly lo...




Publication date: 2016